Method of inferring sensor attitude through multi-feature tracking

ABSTRACT

A method for inferring precise sensor attitude information in a tracking sensor system begins with storing, at a first time, a reference image in a memory associated with the tracking sensor. Next, the method includes sensing a second image at a second time. The sensed image comprises a plurality of sensed feature locations. The method further includes determining the position of the tracking sensor at the second time relative to its position at the first time and then forming a correlation between the sensed feature locations and the predetermined feature locations as a function of the relative position. The method results in an estimation of a tracking sensor pose that is calculated as a function of the correlation. Because the method is primarily computational, implementation requires no new hardware in a tracking sensor system other than that which may be required to provide additional computational capacity.

TECHNICAL FIELD OF THE INVENTION

The present invention relates to control of electromagnetic and other types of sensor systems and, more particularly, describes a method of inferring sensor attitude through multi-feature tracking using position determination systems such as Inertial Navigation Systems (INS) together with imaging sensors such as Forward Looking Infrared (FLIR) and radio frequency (RF) emitter sensor systems.

BACKGROUND OF THE INVENTION

Accurate determinations of sensor pose (i.e., sensor position and attitude) provide essential information for passive ranging methods in sensor systems using this type of range measurement technique. Passive ranging has important practical applications in covert terrain-following and terrain-avoidance methods and in certain missile system uses. For example, sensors that track objects and images, such as those that fire-and-forget missiles possess, require accurate estimates of sensor pose during target intercept and other operations where the missile sensor moves relative to a tracked object.

To determine the sensor pose, it is necessary to accurately determine the sensor attitude. Attitude determination, however, is an ill-conditioned problem. Most tracking sensors employ image correlation and comparison methods based on local statistical distributions of pixel intensity for attitude determinations. For example, adaptive gate centroid trackers use a template to correlate images that the sensor system keeps as a "reference" in memory for this purpose.

These and other methods for accurately determining sensor pose are important for passive ranging algorithms. Passive ranging algorithms, however, are sensitive to errors that exist during range calculations. Errors in passive range calculations are in large part the result of two factors. The first factor is small errors that exist in correlating the sensed image to the reference image in the memory. The second factor is small errors resulting from a decoupling that occurs between the sensor and the navigation system (e.g., an INS) as the aircraft structure or platform containing the sensor flexes or changes shape.

Present sensor systems that perform passive ranging do not address the ill-conditioning that exists in the attitude determination problem. Thus, there is a need for a method that provides accurate attitude determination for the purpose of calculating sensor pose during passive ranging.

There is a need for a method to determine sensor attitude that overcomes the decoupling between the sensor and the navigation system that occurs as a result of aircraft or platform flexure.

Moreover, there is a need for a method and system that minimizes the effect of calculation errors that may exist during the correlation of a template or other reference in memory to the sensed image that the sensor detects.

SUMMARY OF THE INVENTION

The present invention, accordingly, provides a method of inferring sensor attitude during multi-feature tracking that overcomes or reduces disadvantages and limitations associated with prior methods for determining sensor attitude. The method and system improve the sensor pose determinations to enhance passive ranging methods for a variety of sensor systems.

One aspect of the present invention is a method for compensating a sensor pose that comprises the step of computing the compensation from the tracker-reported locations of multiple features of an image. The sensor pose compensation has use in passive ranging, where it is difficult to measure attitude to an acceptable precision.

The steps of the method of the present invention include loading, at a first time, in the sensor tracker system a correlation template, statistical approximation or other reference for the image that includes one or more features for the sensor to detect. Then, at a second, later time after the sensor moves, the method entails relocating the features in the sensed image as sensed feature locations. Then, the method compares the sensed feature locations to the stored reference feature locations. Using the correlation between the sensed feature locations and the reference feature locations, the method of the present invention entails using the sensor tracker to determine the sensor pose. This process is continued iteratively to determine the best estimate of the sensor pose. This best estimate improves the passive ranging calculations of the sensor.

A technical advantage of the present invention is that it provides a method to improve the accuracy of passive range estimation procedures in tracking sensor systems.

Another technical advantage of the present invention is that it provides an accurate method of inferring the sensor attitude when the sensor translational motion is known and the sensor is tracking multiple features.

The present invention has application in missiles and other platforms that use information from the inertial navigation systems (INS) for input to sensor systems such as Forward Looking Infrared (FLIR) and air-to-ground radio frequency (RF) emitter locating sensors.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention and its modes of use and advantages are best understood by reference to the following description of illustrative embodiments when read in conjunction with the accompanying drawings, wherein:

FIG. 1 illustrates the world coordinate system that pertains to the reference and sensed images;

FIG. 2 provides a conceptual illustration of a canonical coordinate system applicable to the preferred embodiment;

FIGS. 3 and 4 illustrate pixel arrays applicable to the preferred embodiment;

FIG. 5 illustrates the geometry of image formation of the preferred embodiments; and

FIG. 6 shows a flowchart illustrating one embodiment of the present invention.

DETAILED DESCRIPTION OF THE INVENTION

The preferred embodiment is best understood by referring to the FIGUREs, wherein like numerals and symbols are used for like and corresponding references to the various drawings.

The method of the preferred embodiment permits determining the attitude (yaw, pitch and roll) of an imaging sensor relative to its attitude at a reference time. Inputs are images at a first time and a second time and relative sensor position. The relative sensor position may be provided by an inertial navigation system (INS) or other source. The method meets the needs of passive ranging methods and systems where accurate estimates of attitude are crucial to the missile or platform performance and are difficult to obtain, particularly when the sensor and INS are not co-located. For many of these systems, however, it is not difficult to obtain estimates of relative translational motion to the required accuracy.

The preferred embodiment is best understood when considering an example in which certain assumptions hold. The first assumption is that the sensor is in or on an aircraft, such as a missile, flying over terrain. The example also assumes that image features are automatically evaluated or selected for robust tracking and that image trackers continually relocate features within the field-of-view (FOV). Furthermore, the example assumes that new features are introduced to replace those lost through motion. Finally, the example uses an assumption that features derive from stationary objects in the scene. With these assumptions, the explanation of the preferred embodiment proceeds as follows.

Feature locations are reported by the trackers as points in the image, although the features themselves are distributed objects. Interpolation techniques can provide sub-pixel resolution, but the example of the preferred embodiment assumes tracker outputs are reported as pixel locations.

The image of the tracking sensor typically is in a world coordinate system, as depicted in FIG. 1. Understanding the preferred embodiment becomes simple, however, in a canonical coordinate system that is obtainable by transforming world coordinates. FIG. 2 shows such a canonical coordinate system.

In FIG. 1, the center-of-focus (COF) is located at point p. A line from p to the fixation point f passes through the center of the imaging array and is normal to it. This determines the direction of the unit normal to the array, denoted n. The center of the imaging array is located at p+ηn, where η is the focal length. Positions within the imaging plane are defined by the 2-dimensional coordinate system (u₂,u₃), where from the perspective of the observer, u₂ is associated with horizontal and u₃, with vertical.

The canonical coordinate system of FIG. 2 may be obtained by a translation and rotation of the world coordinate system of FIG. 1. This transformation leaves invariant distances such as focal length. Note that FIGS. 1 and 2 employ similar nomenclature, with primes distinguishing quantities in the canonical system. In FIG. 2, the COF is at the origin and the fixation point is on the x₁ axis at x₁" = ∥f - p∥. Axes u₂"-u₃" are offset from, but aligned with, x₂"-x₃".

The transformation from world to canonical coordinates consists of certain simple steps. The first step is to translate the origin to p. The system is then rotated by a yaw angle φ₃ about the x₃-axis so that the fixation point is directly below the x₁'-axis in the resulting intermediate coordinate system x₁'-x₂'-x₃. Next, the system is rotated by pitch angle φ₂ about the x₂'-axis so that the fixation point falls on the x₁"-axis in the resulting intermediate coordinate system x₁"-x₂'-x₃. The system is then rotated by a roll angle φ₁ about the x₁"-axis so that the positive u₂"-axis aligns with the positive x₂"-axis of the resulting final coordinate system x₁"-x₂"-x₃".

These transformation steps (one translation followed by three rotations) are representable by the matrix-vector equation:

    x"=S(x-p)                                                  (1)

where 3-vectors x and x" are locations in world and canonical coordinates, respectively. If x represents the location of a feature, it is, by assumption, constant. However, its mapping into canonical coordinates, x", is time-varying due to sensor motion. The 3-by-3 rotation matrix S is given by: ##EQU1## where c_i = cos φ_i and s_i = sin φ_i. An important property of S is orthonormality, that is

    S^(-1) = S^(T),                                            (3)

where S^(T) denotes the matrix transpose of S. This property makes it easy to transform between reference frames.
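As a rough illustration of Equations (1) and (3), the following Python sketch builds a rotation matrix from roll, pitch and yaw and uses the transpose to map back to world coordinates. The first row of the matrix follows from Equations (5) through (8); the second and third rows assume the common yaw-pitch-roll sign convention and may differ in sign from the matrix of Equation (2). The function names and the numerical values are illustrative only.

```python
import math

def rotation_matrix(roll, pitch, yaw):
    """Return a 3x3 world-to-canonical rotation S (angles in radians).

    Row 1 matches Equations (5)-(8). Rows 2 and 3 ASSUME the usual
    yaw-pitch-roll (3-2-1) sign convention; Equation (2) may differ.
    """
    c1, s1 = math.cos(roll), math.sin(roll)
    c2, s2 = math.cos(pitch), math.sin(pitch)
    c3, s3 = math.cos(yaw), math.sin(yaw)
    return [
        [c2 * c3, c2 * s3, -s2],
        [s1 * s2 * c3 - c1 * s3, s1 * s2 * s3 + c1 * c3, s1 * c2],
        [c1 * s2 * c3 + s1 * s3, c1 * s2 * s3 - s1 * c3, c1 * c2],
    ]

def world_to_canonical(S, x, p):
    """Equation (1): x'' = S (x - p)."""
    d = [xi - pi for xi, pi in zip(x, p)]
    return [sum(S[i][k] * d[k] for k in range(3)) for i in range(3)]

def canonical_to_world(S, x_canon, p):
    """Inverse of Equation (1) using S^-1 = S^T: x = S^T x'' + p."""
    return [sum(S[k][i] * x_canon[k] for k in range(3)) + p[i] for i in range(3)]

# Round trip for an arbitrary pose and feature location (made-up numbers).
S = rotation_matrix(math.radians(5.0), math.radians(20.0), math.radians(-30.0))
p = [100.0, -50.0, 10.0]
x = [900.0, 40.0, 300.0]
x_back = canonical_to_world(S, world_to_canonical(S, x, p), p)
```

Because S is orthonormal, no matrix inversion is needed: the round trip recovers x to within floating-point error.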

The Euler angles φ₂ and φ₃ may be related to the unit normal n in world coordinates using the special case of Equation (1): ##EQU2## To solve this, transpose and multiply on the right by S. From the resulting expression, identify:

    c₂ = σ                                                     (5)

    s₂ = -n₃                                                   (6)

    c₃ = n₁/σ                                                  (7)

    s₃ = n₂/σ                                                  (8)

where ##EQU3## These results are useful when φ₁ and n are used to specify attitude.
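Equations (5) through (8) can be exercised with a short Python sketch. It assumes σ = sqrt(n₁² + n₂²), which is what the unit length of n implies when c₂ = σ and s₂ = -n₃ with c₂ taken as nonnegative; the function name is hypothetical.

```python
import math

def pitch_yaw_from_normal(n):
    """Return (pitch, yaw) in radians from a unit normal n = (n1, n2, n3)."""
    n1, n2, n3 = n
    sigma = math.hypot(n1, n2)      # ASSUMED form of sigma; equals c2
    if sigma == 0.0:
        raise ValueError("yaw is undefined when n lies along the x3 axis")
    pitch = math.atan2(-n3, sigma)                 # s2 = -n3, c2 = sigma
    yaw = math.atan2(n2 / sigma, n1 / sigma)       # s3 = n2/sigma, c3 = n1/sigma
    return pitch, yaw

# Build n from known angles, then recover them.
pitch_true = math.radians(70.1)
yaw_true = math.radians(93.8)
n = (
    math.cos(pitch_true) * math.cos(yaw_true),
    math.cos(pitch_true) * math.sin(yaw_true),
    -math.sin(pitch_true),
)
pitch, yaw = pitch_yaw_from_normal(n)
```

Using `atan2` rather than separate inverse sine and cosine calls keeps the quadrant of the yaw angle unambiguous.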

The tracking sensor system assigns to each feature within the FOV a pixel location in the imaging array. This mapping is independent of choice of coordinate system. The physical center of this pixel is taken as the location (u₂", u₃") of the feature in the image plane. It is appropriate to assume the imaging array is rectangular with N_r rows and N_c columns. Pixels are identified by 2-tuples of the form (j,i), where j is the column index and i is the row index. The remaining details of this mapping are contained in ##EQU4## where δ₂ is the column spacing and δ₃ is the row spacing. FIGS. 3 and 4 are illustrations of very simple imaging arrays.

Referring now to FIG. 5, by similar triangles: ##EQU5## Note that (η, u₂", u₃") and (x₁", x₂", x₃") may have different units of measure provided each system is self-consistent. As an expedient, we take η as the unit of measure in the image plane, such that ##EQU6## If u" in Equation (14) is the mapping of x" in object space, then by Equations (12) and (13)

    x₁" u" = x"                                                (15)
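Equation (15) implies that, with the focal length as the unit of measure, the first component of u" is 1 and the remaining components are x₂"/x₁" and x₃"/x₁". A minimal Python sketch of that projection follows; the function name and the sample point are hypothetical.

```python
def project(x_canon):
    """Map a canonical point x'' to u'' = (1, u2'', u3'') in focal-length units."""
    x1, x2, x3 = x_canon
    if x1 <= 0.0:
        raise ValueError("point is not in front of the center of focus")
    return (1.0, x2 / x1, x3 / x1)

x_canon = (2000.0, 50.0, -30.0)      # made-up feature location
u = project(x_canon)
# Equation (15): scaling u'' by the depth x1'' recovers x''.
reconstructed = tuple(x_canon[0] * ui for ui in u)
```

The depth x₁" is lost in the projection, which is why the derivation that follows eliminates it rather than trying to measure it.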

The effect of motion on images collected at two times, 0 and 1 (in appropriate units) is an important parameter of the preferred embodiment. The derivation is simplified by picking the coordinate system to coincide with canonical coordinates at time 0, yielding

    x"(0)=x                                                    (16)

Using Equations (1) and (15), at time 1 the following holds:

    x₁"(1) u"(1) = S(1)[x - p(1)].                             (17)

The first row of Equation (17) defines x₁"(1) and may be used to eliminate it in the second and third rows, with the result

    [u₂"(1) s₁(1) - s₂(1)]^T [x - p(1)] = 0                    (18)

    [u₃"(1) s₁(1) - s₃(1)]^T [x - p(1)] = 0,                   (19)

where s_i^T(1) denotes the i^(th) row of S(1). Substituting from Equations (15) and (16) for x,

    [u₂"(1) s₁(1) - s₂(1)]^T [x₁ u(0) - p(1)] = 0              (20)

    [u₃"(1) s₁(1) - s₃(1)]^T [x₁ u(0) - p(1)] = 0              (21)

Evaluating the inner products yields

    x₁[u₂"(1) g₁₁ - g₂₁] - u₂"(1) g₁₂ + g₂₂ = 0                (22)

    x₁[u₃"(1) g₁₁ - g₃₁] - u₃"(1) g₁₂ + g₃₂ = 0                (23)

where the g_ij are elements of the 3-by-2 matrix

    G = [g_ij] = S(1)[u(0) | p(1)]                             (24)

By eliminating x₁ between Equations (22) and (23) and simplifying, ##EQU7## where u₂ and u₃ are shorthand for u₂"(1) and u₃"(1), respectively. Hence, the determinant in Equation (25) vanishes when motion parameters p(1), φ₁(1), φ₂(1) and φ₃(1) are consistent with the observed feature locations u(0) and u(1). However, the solution of Equation (25) is under-determined, since a single equation contains three unknowns (the Euler angles). This is overcome by adjoining equations for other features. Adopting a least squares approach, it is practical to define the nonnegative cost function for K features (K ≥ 3) as ##EQU8## where Δ_k is the determinant for feature k. The method of the preferred embodiment, thus, is to determine the Euler angles which minimize the cost function J.
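The elimination of x₁ and the resulting least-squares cost can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: the residual is written as the 2-by-2 expression obtained directly from Equations (22) and (23), which may differ in layout or sign from the determinant of Equation (25) (the sign does not affect J); J is taken as the sum of squared residuals; and the rotation matrix uses an assumed yaw-pitch-roll convention. All names and numbers are hypothetical.

```python
import math

def rotation_matrix(roll, pitch, yaw):
    """Rotation S; rows 2 and 3 ASSUME the usual yaw-pitch-roll convention."""
    c1, s1 = math.cos(roll), math.sin(roll)
    c2, s2 = math.cos(pitch), math.sin(pitch)
    c3, s3 = math.cos(yaw), math.sin(yaw)
    return [
        [c2 * c3, c2 * s3, -s2],
        [s1 * s2 * c3 - c1 * s3, s1 * s2 * s3 + c1 * c3, s1 * c2],
        [c1 * s2 * c3 + s1 * s3, c1 * s2 * s3 - s1 * c3, c1 * c2],
    ]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def delta(S1, p1, u0, u1):
    """Residual for one feature: x1 eliminated between Equations (22) and (23)."""
    # Rows of G = S(1)[u(0) | p(1)], Equation (24).
    (g11, g12), (g21, g22), (g31, g32) = [(dot(r, u0), dot(r, p1)) for r in S1]
    u2, u3 = u1[1], u1[2]
    return (u2 * g11 - g21) * (u3 * g12 - g32) - (u3 * g11 - g31) * (u2 * g12 - g22)

def cost(angles, p1, features):
    """J for Euler angles (roll, pitch, yaw): sum of squared residuals."""
    S1 = rotation_matrix(*angles)
    return sum(delta(S1, p1, u0, u1) ** 2 for u0, u1 in features)

def observe(S, p, x):
    """Noise-free image location u'' of world point x for sensor pose (S, p)."""
    xc = [dot(row, [xi - pi for xi, pi in zip(x, p)]) for row in S]
    return (1.0, xc[1] / xc[0], xc[2] / xc[0])

# Synthetic check: the time-0 frame is canonical, as in Equation (16).
true_angles = (math.radians(5.0), math.radians(-3.0), math.radians(2.0))
p1 = (50.0, 8.0, -2.0)
points = [(1000.0, 100.0, -50.0), (1200.0, -80.0, 60.0),
          (900.0, 20.0, 150.0), (1500.0, -200.0, -100.0)]
S0 = rotation_matrix(0.0, 0.0, 0.0)
S_true = rotation_matrix(*true_angles)
features = [(observe(S0, (0.0, 0.0, 0.0), x), observe(S_true, p1, x)) for x in points]

j_true = cost(true_angles, p1, features)
j_off = cost((math.radians(6.0), math.radians(-3.0), math.radians(2.0)), p1, features)
```

With noise-free observations, `j_true` is zero to within rounding error, while a one-degree roll error in `j_off` produces a clearly larger cost. That difference is what the minimization exploits.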

The problem of minimizing Equation (26), however, is nonlinear and must be solved iteratively. Newton's method or other gradient-based approaches can be applied, since analytical expressions for the derivatives can be derived. However, efficient procedures are available that avoid the need for differentiation. For example, Powell's method, as described in M. J. D. Powell, "An Efficient Method of Finding the Minimum of a Function of Several Variables Without Calculating Derivatives," The Computer Journal, 7, 155-162, 1964, and G. R. Walsh, Methods of Optimization, John Wiley and Sons, 1975, pp. 129-138, provides an applicable optimization technique. An illustrative embodiment uses a two-stage search, which the following discussion describes.

Testing of the preferred embodiment has been performed with synthetic data. The minimum of the cost function J was sought by searching the three-dimensional solution space about a nominal set of Euler angles, such as might be provided by an INS. Experiments were repeated for several levels of measurement noise.

The resolution and FOV of the sensor will affect performance. The sensor model of the tests assumed an array of 480 rows and 960 columns of light sensitive elements or pixels. Each pixel is square and subtends an angle of 0.1 milliradian. This corresponds to a pixel dimension of 10 microns and a focal length of 10 centimeters, for example. The horizontal FOV is 96 milliradians or 5.5 degrees. The narrow FOV makes it difficult to maintain a feature within the FOV for long.
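The sensor-model figures quoted above can be checked with a few lines of Python. The variable names are illustrative, and the vertical FOV is not stated in the text; it is derived here from the same per-pixel angle.

```python
import math

rows, cols = 480, 960
pixel_pitch_m = 10e-6        # 10 microns
focal_length_m = 0.10        # 10 centimeters

# Angle subtended by one pixel (small-angle approximation).
ifov_rad = pixel_pitch_m / focal_length_m
horizontal_fov_rad = cols * ifov_rad
horizontal_fov_deg = math.degrees(horizontal_fov_rad)
vertical_fov_deg = math.degrees(rows * ifov_rad)   # derived, not from the text
```

The per-pixel angle works out to 0.1 milliradian and the horizontal FOV to 96 milliradians, or about 5.5 degrees, in agreement with the stated values.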

The pose of the sensor at times 0 and 1 also affects performance. The locations of the COF at times 0 and 1 were chosen as follows: ##EQU9##

Attitude was specified as a roll angle φ₁ and the normal n to the imaging array. At time 0, roll was assumed to be 2.5 degrees and the normal, ##EQU10##

At time 1, roll was assumed to be 5.0 degrees. The normal was not specified directly, but was computed by requiring that the fixation point be identical to that at time 0. In this calculation, fixation points were defined at zero altitude. Computing n(1) in this manner increases the probability that a feature will be visible at both locations.

A contributor to measurement error is quantization of (u₂ ", u₃ ") to the discrete pixel location (j,i). While sub-pixel interpolation may reduce this error, the tests of the preferred embodiment do not model sub-pixel interpolation. To model other tracker errors, Gaussian white noise was added to the feature locations before quantization. The noise had zero mean. The variance was experimentally controlled.

Feature locations in the imaging array at time 0 were picked stochastically using a Gaussian random number generator. The nature of the distribution caused feature candidates to cluster near the center of the array. A ray from the COF was struck through the center of the pixel and projected until it reached an altitude h, another random variable which modeled the topography of the terrain. This point was then back-projected to the COF at time 1 and the intersection with the imaging plane noted. The location of the intersection was corrupted with noise and quantized to a pixel location. If the location was within the FOV, the feature was accepted. If not, it was discarded and a new candidate was processed. Candidates were processed until eight features had been defined.

Since the coordinate system at time 0 is not canonical, the derivations of the preceding section must be modified. Fortunately, it is only necessary to replace S(1) with S(1)S^(T)(0) and p(1) with S(0)[p(1) - p(0)].
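The substitution follows from writing a feature's time-1 canonical location in terms of its time-0 canonical location: S(1)[x - p(1)] = S(1)S^T(0){S(0)[x - p(0)] - S(0)[p(1) - p(0)]}. The Python sketch below checks that identity numerically. The helper names are hypothetical, and the two single-axis rotations merely supply orthonormal test matrices rather than the rotation matrix of Equation (2).

```python
import math

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def relative_pose(S0, p0, S1, p1):
    """Return (S(1) S^T(0), S(0)[p(1) - p(0)])."""
    S_rel = matmul(S1, transpose(S0))
    p_rel = matvec(S0, [a - b for a, b in zip(p1, p0)])
    return S_rel, p_rel

def rot3(angle):
    """Rotation about the third axis (test matrix only)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def rot2(angle):
    """Rotation about the second axis (test matrix only)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]]

# Made-up poses and feature location.
S0, p0 = matmul(rot2(0.3), rot3(1.2)), [10.0, 20.0, 30.0]
S1, p1 = matmul(rot2(0.35), rot3(1.1)), [60.0, 25.0, 28.0]
x = [500.0, 900.0, -200.0]

S_rel, p_rel = relative_pose(S0, p0, S1, p1)
x0 = matvec(S0, [a - b for a, b in zip(x, p0)])          # canonical at time 0
direct = matvec(S1, [a - b for a, b in zip(x, p1)])      # canonical at time 1
via_relative = matvec(S_rel, [a - b for a, b in zip(x0, p_rel)])
```

Both routes give the same time-1 canonical coordinates, so the earlier derivation can be reused unchanged once the relative rotation and translation are substituted.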

    TABLE 1
    Results (deg) with resolution = 1.0 deg.
    ______________________________________
                      RMS Noise (pixels)
    Euler Angle     0.2680    0.6867    1.7340
    ______________________________________
    Roll
      True            5.00      5.00      5.00
      Estimated       4.10      4.20      4.40
      Scatter         0.46      0.60      0.60
    Pitch
      True           70.10     70.10     70.10
      Estimated      70.00     70.00     70.00
      Scatter         0.04      0.04      0.05
    Yaw
      True           93.80     93.80     93.80
      Estimated      92.50     92.60     92.60
      Scatter         0.47      0.42      0.46
    ______________________________________

A two-stage search was employed to seek the minimum value of the cost function J. Resolution of the first stage was specified interactively and was either 1.0 degree or 0.1 degree. This is the resolution referred to in the titles of Tables 1 through 4, herein. The second stage resolution was 10 times finer than the first stage resolution, i.e., 0.1 degree and 0.01 degree, respectively. The range searched was ±5 resolution elements about a nominal. To allow for possible entrapment at local minima, the Euler angles associated with the 15 lowest values of J were stored and used as nominal values in the second stage. Results of the second stage searches were sorted to extract the five results with the smallest value of J. The medians of the Euler angles in this group were taken as the estimates. As a measure of confidence, the RMS deviation of the individual estimates from the median was computed. Tables 1 and 3 list these measurements as the "scatter."
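The two-stage search just described can be sketched in Python as follows. This is one reading of the description, not a definitive implementation: it assumes each second-stage search contributes a single result (its own minimum), and it is exercised on a toy quadratic cost standing in for J. All names are hypothetical.

```python
import itertools
import math
import statistics

def grid(center, step, half_width=5):
    """Yield every angle triple within +/- half_width steps of center."""
    offsets = range(-half_width, half_width + 1)
    for k in itertools.product(offsets, repeat=3):
        yield tuple(c + ki * step for c, ki in zip(center, k))

def two_stage_search(cost, nominal, step, keep=15, top=5):
    # Stage 1: coarse grid about the nominal; keep the lowest-cost triples.
    seeds = sorted(grid(nominal, step), key=cost)[:keep]
    # Stage 2: grid ten times finer about each seed.
    # ASSUMPTION: each second-stage search yields one result, its minimum.
    refined = sorted(
        (min(grid(seed, step / 10.0), key=cost) for seed in seeds), key=cost
    )
    best = refined[:top]
    # Estimate = per-angle median; scatter = RMS deviation from that median.
    estimate = tuple(statistics.median(a[i] for a in best) for i in range(3))
    scatter = tuple(
        math.sqrt(sum((a[i] - estimate[i]) ** 2 for a in best) / len(best))
        for i in range(3)
    )
    return estimate, scatter

# Toy stand-in for J with a known minimum (hypothetical, for illustration).
target = (1.23, -0.4, 2.0)

def toy_cost(angles):
    return sum((a - t) ** 2 for a, t in zip(angles, target))

estimate, scatter = two_stage_search(toy_cost, (0.0, 0.0, 0.0), 1.0)
```

Each grid holds 11 x 11 x 11 = 1331 candidate triples, so the full search evaluates the cost roughly 21,000 times. Carrying several seeds into the second stage is what guards against a coarse-grid minimum that lies in the wrong basin.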

Table 1 summarizes the results obtained with a first stage resolution of 1.0 degree. Note that good estimates of pitch are obtained throughout, while the roll and yaw estimates deviate further from their true values, by about 0.6 to 1.3 degrees. The scatter is a good qualitative indicator of accuracy. Table 2 gives values of the cost function J for the true Euler angles and those yielding the minimum value of J. For the lowest level of measurement noise, the true Euler angles yielded a smaller value for J than the minimizing values. This indicates that the search did not uncover the neighborhood of the true minimum. To see if this could be corrected by increasing the resolution of the search, the resolution was made 10 times finer and the experiment was then repeated. The results are shown in Table 3. Note that errors in roll and yaw are greatly diminished. Additionally, the low scatter suggests reliable estimates. Values of the cost function of this search are given in Table 4. As expected, values for the minimizing Euler angles are smaller, especially the one for the lowest measurement noise.

    TABLE 2
    J(10^6) with resolution = 1.0 deg.
    ______________________________________
                      RMS Noise (pixels)
    Euler Angles    0.2680    0.6867    1.7340
    ______________________________________
    True            1.4363    4.2261    7.3928
    Minimizing      2.3221    3.6228    4.5824
    ______________________________________

    TABLE 3
    Results (deg) with resolution = 0.1 deg.
    ______________________________________
                      RMS Noise (pixels)
    Euler Angle     0.2680    0.6867    1.7340
    ______________________________________
    Roll
      True            5.00      5.00      5.00
      Estimated       4.96      4.86      5.25
      Scatter         0.04      0.06      0.05
    Pitch
      True           70.10     70.10     70.10
      Estimated      70.10     70.10     70.10
      Scatter         0.00      0.00      0.00
    Yaw
      True           93.80     93.80     93.80
      Estimated      93.75     93.75     93.75
      Scatter         0.01      0.03      0.01
    ______________________________________

    TABLE 4
    J(10^6) with resolution = 0.1 deg.
    ______________________________________
                      RMS Noise (pixels)
    Euler Angles    0.2680    0.6867    1.7340
    ______________________________________
    True            1.4363    4.2261    7.3928
    Minimizing      1.3736    3.4680    4.3660
    ______________________________________

In summary, the preferred embodiment illustrates the use of image manipulation to improve the sensor pose determinations by more accurately producing sensor attitude measurements. By providing an exact, discrete time formulation of the attitude determination problem, the preferred embodiment enhances methods of passive ranging.

Although the present invention and its advantages have been described in detail, it should be understood that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the invention as defined by the appended claims. 

What is claimed is:
 1. A method for inferring precise sensor attitude information in a tracking sensor system, comprising the steps of: sensing at a first time a reference image using the tracking sensor to provide signals representing reference feature locations in the reference image; storing in a memory signals representing said reference feature locations; sensing at a second time a second image using the tracking sensor to provide signals representing second time sensed feature locations; determining the relative translational position of the tracking sensor at said second time relative to the sensor position at said first time without using any information extracted from said second image; and determining the pose of said tracking sensor by using mathematical steps to form a correlation between a plurality of said second time sensed feature locations in said second image and a plurality of said reference feature locations in said reference image, wherein determining the sensor pose comprises using said relative translational position to improve the conditioning of at least one mathematical step; whereby said pose includes the precise sensor attitude information.
 2. The method of claim 1, wherein said storing step comprises the step of storing data from a correlation template of the reference image at said first time.
 3. The method of claim 1, wherein said correlation forming step comprises the step of applying discrete time formulation relationship between said reference image and said second image.
 4. The method of claim 1, wherein said forming step includes the step of forming said correlation in accordance with Euler angles which minimize a cost function, J, where ##EQU11## where K represents a count of reference features, and where Δ_(k) represents a determinant of a particular reference feature, k.
 5. The method of claim 1, wherein said relative translational position determining step uses data from an inertial navigation system.
 6. The method of claim 1, wherein said images comprise Forward Looking Infrared images.
 7. The method of claim 1, wherein said images comprise air-to-ground RF emitter location images.
 8. The method of claim 1, further comprising the step of providing said sensor attitude information as an input to a passive ranging system.
 9. The method of claim 1, wherein said forming a correlation comprises using least-squares estimation.
 10. The method of claim 1, wherein said determining the relative position step uses a location sensor that is flexibly coupled to said tracking sensor.
 11. The method of claim 10, wherein said location sensor is an inertial navigation system.
 12. The method of claim 1, wherein said determining the relative translational position step determines only the sensor's relative translational position. 